Towards Fully Bayesian Speaker Recognition: Integrating Out the Between-Speaker Covariance
Authors
Abstract
We propose a variational Bayes solution for integrating out the model parameters in a generative i-vector speaker recognizer. The existing state of the art in generative i-vector modelling plugs in fixed maximum-likelihood point estimates of the model parameters. This recipe may suffer from over-fitting, especially of the between-speaker covariance. We show how to integrate out the between-speaker covariance and demonstrate dramatic improvements on NIST SRE 2010.
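As a rough illustration of the contrast the abstract draws, the plug-in recipe and the integrated-out alternative can be written generically as below. The symbols are assumptions introduced here, not notation from the paper: x is a test i-vector, D the training data, B the between-speaker covariance, \hat{B}_{ML} its maximum-likelihood estimate, and q(B) a variational approximation to the posterior p(B | D).

```latex
% Plug-in: score with a single point estimate of B
p_{\text{plug-in}}(x) = p\bigl(x \mid \hat{B}_{\mathrm{ML}}\bigr)

% Bayesian: average over the uncertainty in B;
% variational Bayes replaces the exact posterior with q(B)
p(x \mid \mathcal{D}) = \int p(x \mid B)\, p(B \mid \mathcal{D})\, \mathrm{d}B
\;\approx\; \int p(x \mid B)\, q(B)\, \mathrm{d}B
```

When the training data determine B only weakly, the posterior is broad and the two expressions can differ substantially, which is the over-fitting risk the abstract mentions.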
Similar resources
Towards domain independent speaker clustering
Speaker clustering is a key component in many speech processing applications. We focus on Broadcast News meta data annotation and speaker adaptation. In this setting, speaker clustering consists of identifying who spoke, and when they spoke in a long news broadcast. Speaker clustering is given a set of short audio segments. Ideally, it will discover how many people are speaking in the broadcast...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Factor analysis for speaker segmentation and improved speaker diarization
Speaker diarization includes two steps: speaker segmentation and speaker clustering. Speaker segmentation searches for speaker boundaries, whereas speaker clustering aims at grouping speech segments of the same speaker. In this work, the segmentation is improved by replacing the Bayesian Information Criterion (BIC) with a new iVector-based approach. Unlike BIC-based methods which trigger on any...
A Bayesian Approach to Speaker Recognition Based on GMMs Using Multiple Model Structures
This paper proposes a speaker recognition technique using multiple model structures based on the Bayesian approach. In recent speaker recognition, many sophisticated statistical models have been proposed, e.g., Joint Factor Analysis and i-Vector based method. However, since most of them are based on Gaussian Mixture Models (GMMs), therefore improving estimation accuracy of generative models, i....
Journal title:
Volume, issue:
Pages: -
Publication date: 2011